External-Memory Search Trees with Fast Insertions

نویسندگان

  • Jelani Nelson
  • Charles E. Leiserson
  • Bradley C. Kuszmaul
  • Arthur C. Smith
چکیده

This thesis provides both experimental and theoretical contributions regarding externalmemory dynamic search trees with fast insertions. The first contribution is the implementation of the buffered repository B-tree, a data structure that provably outperforms B-trees for updates at the cost of a constant factor decrease in query performance. This thesis also describes the cache-oblivious lookahead array, which outperforms B-trees for updates at a logarithmic cost in query performance, and does so without knowing the cache parameters of the system it is being run on. The buffered repository B-tree is an external-memory search tree that can be tuned for a tradeoff between queries and updates. Specifically, for any ǫ ∈ [1/ lgB, 1] this data structure achieves O((1/ǫB)(1+ logB(N/B))) block transfers for Insert and Delete and O((1/ǫ)(1 + logB(N/B))) block transfers for Search. The update complexity is amortized and is O((1/ǫ)(1+ logB(N/B))) in the worst case. Using the value ǫ = 1/2, I was able to achieve a 17 times increase in insertion performance at the cost of only a 3 times decrease in search performance on a database with 12-byte items on a disk with a 4-kilobyte block size. This thesis also shows how to build a cache-oblivious data structure, the cacheoblivious lookahead array, which achieves the same bounds as the buffered repository B-tree in the case where ǫ = 1/ lgB. Specifically, it achieves an update complexity of O((1/B) log(N/B)) and a query complexity of O(log(N/B)) block transfers. This is the first data structure to achieve these bounds cache-obliviously. The research involving the cache-oblivious lookahead array represents joint work with Michael A. Bender, Jeremy Fineman, and Bradley C. Kuszmaul. Thesis Supervisor: Charles E. Leiserson Title: Professor of Computer Science and Engineering Thesis Supervisor: Bradley C. Kuszmaul Title: Research Scientist

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Persistent B+-Trees in Non-Volatile Main Memory

Computer systems in the near future are expected to have NonVolatile Main Memory (NVMM), enabled by a new generation of Non-Volatile Memory (NVM) technologies, such as Phase Change Memory (PCM), STT-MRAM, and Memristor. The non-volatility property has the promise to persist in-memory data structures for instantaneous failure recovery. However, realizing such promise requires a careful design to...

متن کامل

Efficient cross-trees for external memory

We describe eecient methods for organizing and maintaining large multidimensional data sets in external memory. This is particular important as access to external memory is currently several order of magnitudes slower than access to main memory, and current technology advances are likely to make this gap even wider. We focus particularly on multidimensional data sets which must be kept simultan...

متن کامل

New Combinatorial Properties and Algorithms for AVL Trees

In this thesis, new properties of AVL trees and a new partitioning of binary search trees named core partitioning scheme are discussed, this scheme is applied to three binary search trees namely AVL trees, weight-balanced trees, and plain binary search trees. We introduce the core partitioning scheme, which maintains a balanced search tree as a dynamic collection of complete balanced binary tre...

متن کامل

Probabilistic analysis of the asymmetric digital search trees

In this paper, by applying three functional operators the previous results on the (Poisson) variance of the external profile in digital search trees will be improved. We study the profile built over $n$ binary strings generated by a memoryless source with unequal probabilities of symbols and use a combinatorial approach for studying the Poissonized variance, since the probability distribution o...

متن کامل

Obtaining Provably Good Performance from Suffix Trees in Secondary Storage

Designing external memory data structures for string databases is of significant recent interest due to the proliferation of biological sequence data. The suffix tree is an important indexing structure that provides optimal algorithms for memory bound data. However, string Btrees provide the best known asymptotic performance in external memory for substring search and update operations. Work on...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006